# Causal Discovery in time-series data
This folder contains the R and Python scripts for the example in Section 3.2.4.

## Main files 

We have two main files:

* [PCMCI.py](https://github.com/AnaRitaNogueira/Methods-and-Tools-for-Causal-Discovery-and-Causal-Inference/blob/main/3.%20Causal%20Discovery/3.2%20Finding%20causal%20relations%20in%20time-series%20data/PCMCI.py), which contains the code to replicate the results in Figure 9(b) (PCMCI).
* [evaluation.R](https://github.com/AnaRitaNogueira/Methods-and-Tools-for-Causal-Discovery-and-Causal-Inference/blob/main/3.%20Causal%20Discovery/3.2%20Finding%20causal%20relations%20in%20time-series%20data/evaluation.R), which contains the code to replicate the results in Code 1 (Granger causality) and Figure 9(a) (tsFCI). The file also contains the code needed to compare the methods.
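
For readers unfamiliar with the Granger causality test mentioned above, the following is a minimal Python sketch of the underlying idea. It is not the code in `evaluation.R`: the data are synthetic, the helper names (`lagged`, `rss`, `granger_f`) are hypothetical, and only the F statistic is computed (no p-value). It assumes NumPy is installed.

```python
import numpy as np


def lagged(series, p):
    """Return columns [s_{t-1}, ..., s_{t-p}] for t = p .. n-1."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])


def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)


def granger_f(cause, effect, p=1):
    """F statistic for 'cause Granger-causes effect' with p lags."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    target = effect[p:]
    const = np.ones((len(target), 1))
    # Restricted model: the effect is explained by its own past only.
    restricted = np.hstack([const, lagged(effect, p)])
    # Unrestricted model: the past of the candidate cause is added.
    unrestricted = np.hstack([restricted, lagged(cause, p)])
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)


# Synthetic example (an assumption for illustration): x drives y with lag 1.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.normal()

f_xy = granger_f(x, y, p=1)  # x -> y: expected to be large
f_yx = granger_f(y, x, p=1)  # y -> x: expected to be small
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.2f}")
```

A large F statistic in one direction and a small one in the other suggests that the past of `x` helps predict `y` but not the reverse. A full test would compare the statistic against an F distribution with `p` and `dof` degrees of freedom.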

## Auxiliary files
We have two auxiliary files:

* [ChickEgg unfolded.csv](https://github.com/AnaRitaNogueira/Methods-and-Tools-for-Causal-Discovery-and-Causal-Inference/blob/main/3.%20Causal%20Discovery/3.2%20Finding%20causal%20relations%20in%20time-series%20data/ChickEgg%20unfolded.csv), which contains the ChickEgg training set used in the example.
* [ChickEgg_test unfolded.csv](https://github.com/AnaRitaNogueira/Methods-and-Tools-for-Causal-Discovery-and-Causal-Inference/blob/main/3.%20Causal%20Discovery/3.2%20Finding%20causal%20relations%20in%20time-series%20data/ChickEgg_test%20unfolded.csv), which contains the ChickEgg test set used in the example.


We also link to common tools for building causal discovery models, in case different techniques need to be tested. Finally, we list available time-series data sets.

## Tools for time-series data
| Software | Author | Original Paper | Language | Original Source | Keywords |
| --- | --- | --- | --- | --- | --- |
| Amortized Causal Discovery | Sindy Löwe, David Madras, Richard Zemel, Max Welling | [Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data](https://arxiv.org/abs/2006.10833) | Python | [GitHub](https://github.com/loeweX/AmortizedCausalDiscovery) | NA |
| TCDF: Temporal Causal Discovery Framework | Meike Nauta, Doina Bucur and Christin Seifert | [Causal Discovery with Attention-Based Convolutional Neural Networks](https://www.mdpi.com/2504-4990/1/1/19) | Python | [GitHub](https://github.com/M-Nauta/TCDF) | convolutional neural network; time series; causal discovery; attention; machine learning |
| Tigramite | Jakob Runge | NA | Python | [GitHub](https://github.com/jakobrunge/tigramite) | causal discovery; time series |

## Publicly Available Data Sets

The data sets in the following table are publicly available in online repositories and have been used in published work on causal discovery.

| Paper | Data sets |
|---|---|
| [Huang, Yuxiao, and Samantha Kleinberg. &quot;Fast and accurate causal inference from time series data.&quot; The Twenty-Eighth International Flairs Conference. 2015.](https://d1wqtxts1xzle7.cloudfront.net/37392191/huang_flairs15.pdf?1429716585=&amp;response-content-disposition=inline%3B+filename%3DFast_and_Accurate_Causal_Inference_from.pdf&amp;Expires=1611255529&amp;Signature=TOeW7o3RDjwLy6qwurN~LLNYD31A-VhPVosR8yIgo90EwU6oO~VeUbLqEtdZP3xvkuLkHiDx5s87Lj3-fat1~NRwr7VM2NjHEo4l8P2mi9kQ62uVw79h3bvLZhpcYAI3ynMNe6f9zkpHFjvg7DDgz0ofxBao8MNz0arjuwz9Ud~gNQjGb3z3lznuuyr96VDyMyBQIBDUtC82aFGWgG-hzFk1yF~c8v50MjjeMFgns-a6Q7d9U6pd0Xyzio~2HJmpFoTIVfaT3Kk4Nd59b0Zm5~Y4H4Vsmvm0b40-HUWzKKZZ~9HbJy~wMKxyX3pO5zWh0zv1kyc29ticRWKZkG--8Q__&amp;Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA) | [FLAIRS](http://www.skleinberg.org/data.html) |
| [Kleinberg, Samantha. Causality, probability, and time. Cambridge University Press, 2013.](https://books.google.pt/books?hl=pt-PT&amp;lr=&amp;id=KHwqL43SaZQC&amp;oi=fnd&amp;pg=PR7&amp;dq=Causality,+Probability,+and+Time&amp;ots=Lff-d7vZz9&amp;sig=6_C-PCQqpVGaOP0nJALQhTyUEWg&amp;redir_esc=y#v=onepage&amp;q=Causality%2C%20Probability%2C%20and%20Time&amp;f=false) | [FinanceCPT](http://www.skleinberg.org/data.html) |
|[S. A. Rahman, C. Merck, Yuxiao Huang and S. Kleinberg, "Unintrusive eating recognition using Google Glass," 2015 9th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth), Istanbul, 2015, pp. 108-111, doi: 10.4108/icst.pervasivehealth.2015.259044.](https://ieeexplore.ieee.org/document/7349385)|[GLEAM dataset](http://www.skleinberg.org/data/GLEAM.tar.gz)<sup>[1](#myfootnote1.1)</sup><sup>[2](#myfootnote1.2)</sup>|

*<sub>
<a name="myfootnote1.1">1</a>: "Use of the data is permitted for non-commercial research and education purposes provided you properly credit the data source"<br>
<a name="myfootnote1.2">2</a>: "Use of the data is permitted for non-commercial research and education purposes provided you do not redistribute the data (with or without modification)"
</sub>*

## Data Sets Available on Request

In this section, we present several data sets that are available on request (for example, by filling in a form).

| Paper | Data sets |
|---|---|
|[Mark Mirtchouk, Drew Lustig, Alexandra Smith, Ivan Ching, Min Zheng, and Samantha Kleinberg. 2017. Recognizing Eating from Body-Worn Sensors: Combining Free-living and Laboratory Data. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3, Article 85 (September 2017), 20 pages. DOI:https://doi.org/10.1145/3131894](https://dl.acm.org/doi/10.1145/3131894)|[ACE Free-living dataset](http://skleinberg.org/data/ACE-FL.html)<sup>[1](#myfootnote2.1)</sup><sup>[2](#myfootnote2.2)</sup><sup>[3](#myfootnote2.3)</sup>|
|[Christopher Merck, Christina Maher, Mark Mirtchouk, Min Zheng, Yuxiao Huang, and Samantha Kleinberg. 2016. Multimodality sensing for eating recognition. In Proceedings of the 10th EAI International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth '16). ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, BEL, 130–137.](https://dl.acm.org/doi/10.5555/3021319.3021339)<br>[Mark Mirtchouk, Christopher Merck, and Samantha Kleinberg. 2016. Automated estimation of food type and amount consumed from body-worn audio and motion sensors. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '16). Association for Computing Machinery, New York, NY, USA, 451–462. DOI:https://doi.org/10.1145/2971648.2971677](https://dl.acm.org/doi/10.1145/2971648.2971677)|[ACE dataset (lab)](http://skleinberg.org/data/ACE.html)<sup>[1](#myfootnote2.1)</sup><sup>[2](#myfootnote2.2)</sup><sup>[3](#myfootnote2.3)</sup>|

*<sub>
<a name="myfootnote2.1">1</a>: "Use of the data is permitted for non-commercial research and education purposes provided you properly credit the data source"<br>
<a name="myfootnote2.2">2</a>: "Use of the data is permitted for non-commercial research and education purposes provided you do not attempt to identify participants in the study"<br>
<a name="myfootnote2.3">3</a>: "Use of the data is permitted for non-commercial research and education purposes provided you do not redistribute the data (with or without modification)"
</sub>*
  
  
## Other causal discovery data sets

The data sets below are classified as causal discovery data sets by the sources that host them.

| Data set | Source Location |
|---|---|
| Horton General Hospital Data Set | [UCI Repository](https://archive.ics.uci.edu/ml/datasets/Gas+Sensor+Array+Drift+Dataset+at+Different+Concentrations) |
| UbiqLog | [UCI Repository](https://archive.ics.uci.edu/ml/datasets/UbiqLog+(smartphone+lifelogging)) |
| PROMO | [Causality Workbench](http://clopinet.com/causality/data/promo/) |
| Pedestrian in Traffic Dataset Data Set<sup>[2](#myfootnote3.2)</sup> | [UCI Repository](https://archive.ics.uci.edu/ml/datasets/Pedestrian+in+Traffic+Dataset) |
| Dunnhumby - The Complete Journey | [Kaggle](https://www.kaggle.com/frtgnn/dunnhumby-the-complete-journey?select=causal_data.csv) |
| Figure Eight: Medical Sentence Summary | [Kaggle](https://www.kaggle.com/kmader/figure-eight-medical-sentence-summary?select=validation.csv) |
| Temperature and cherry blossom status | [Kaggle](https://www.kaggle.com/akioonodera/temperature-and-flower-status) |
| Historical Gold Stock-GLD (EFT) | [Kaggle](https://www.kaggle.com/kalpanadontha/historicalgoldstockrandgoldresources) |
| Medical Information Extraction | [Kaggle](https://www.kaggle.com/mathurinache/medical-information-extraction) |
| Influenza and Env Factors | [Kaggle](https://www.kaggle.com/ffejgnaw/weekly-influenza-and-env-factors-2016-2018) |
| National Footprint Accounts 2018 | [Kaggle](https://www.kaggle.com/footprintnetwork/national-footprint-accounts-2018) |
| School Shootings US 1990-present | [Kaggle](https://www.kaggle.com/ecodan/school-shootings-us-1990present?select=cps_01_formatted.csv) |


*<sub>
<a name="myfootnote3.1">1</a>: "Citation of both papers is required:<br>
A Vergara, S Vembu, T Ayhan, M Ryan, M Homer, R Huerta. "Chemical gas sensor drift compensation using classifier ensembles." Sensors and Actuators B: Chemical 166 (2012): 320-329.;
I Rodriguez-Lujan, J Fonollosa, A Vergara, M Homer, R Huerta. "On the calibration of sensor arrays for pattern recognition using the minimal number of experiments." Chemometrics and Intelligent Laboratory Systems 130 (2014): 123-134."<br>
<a name="myfootnote3.2">2</a>: "You may use this data for scientific, non-commercial purposes, as long as you give credit to the owners when publishing any work based on this data. Please cite Blaiotta, Claudia. 'Learning generative socially-aware models of pedestrian motion.' IEEE Robotics and Automation Letters, 2019."
</sub>*

